Installing packages

If you don’t already have these packages installed, run this code:

install.packages("plotly")
install.packages("ggplot2")
install.packages("dplyr")
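
If you are unsure which of these packages are already installed, a check like the one below can help. This is a minimal sketch: the variable names are our own, and the install line is left commented out so nothing is installed by accident.

```r
# Packages used in this activity
required <- c("plotly", "ggplot2", "dplyr")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
to_install <- required[!is_installed]

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install only what is missing
} else {
  message("All required packages are already installed.")
}
```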

Run this code to load the packages required for this activity:

library(plotly)
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
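
The messages above mean that some functions share a name across packages. For example, filter() exists in both stats and dplyr, and layout() exists in both graphics and plotly. The most recently loaded package takes priority. If you ever want to be explicit about which version you mean, you can prefix the function with its package name, as in this small sketch (the object name heavy_cars is just for illustration).

```r
library(dplyr)

# pkg::fun() always calls the version from that package, whatever is masked
heavy_cars <- dplyr::filter(mtcars, wt > 5)

# Number of cars heavier than 5,000 lbs
nrow(heavy_cars)
```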

Introduction

Data visualization is an important tool in data science. It helps people share the findings of their research or analysis with audiences from many different backgrounds through effective visuals. At Macalester, the data visualization package we are most commonly taught in R is ggplot. This package lets us take a data frame and turn it into a graphic, specifying aesthetic parameters to create the desired visualization.

For this activity, we will build on our knowledge of data visualization and learn a new tool called plotly, “an Interactive web-based data visualization” package that can be used in R and Python. This package lets us turn data into an interactive visualization that can enhance the message of an analysis. By completing this activity and its reflection points, we hope you will learn a new skill to add to your bag of tricks.

Before beginning this activity, please look through this reference, which covers the basics and syntax of plotting with plotly. Throughout the activity, if anything is unclear, please look back at it for code help: https://plotly-r.com/

Section 1: Basics

For this activity, we will be using the mtcars data set that is built into R.

glimpse(mtcars)
## Rows: 32
## Columns: 11
## $ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,…
## $ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,…
## $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16…
## $ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180…
## $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,…
## $ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.…
## $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18…
## $ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,…
## $ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,…
## $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,…
## $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,…
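
Notice that every column is stored as a number (<dbl>), including variables such as cyl and am that really describe categories. That is why the plots in this activity wrap cyl in factor(). As a sketch, you could also convert these columns once up front. The name mtcars_labeled is our own, and the labels rely on the mtcars documentation, where am is 0 for automatic and 1 for manual.

```r
library(dplyr)

# Convert the category-like columns to factors with readable labels
mtcars_labeled <- mtcars %>%
  mutate(
    cyl = factor(cyl),
    am = factor(am, levels = c(0, 1), labels = c("automatic", "manual"))
  )

levels(mtcars_labeled$cyl)
table(mtcars_labeled$am)
```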

As mentioned above, we have commonly been taught to use ggplot to make visualizations. It is an effective package that gives users a vast number of options. Included below is a scatter plot using ggplot and the mtcars dataset.

ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(
    x = "Weight (X thousand lbs)", y = "Miles per gallon",
    title = "Fuel efficiency by weight", color = "Cylinders") +
  theme_minimal()

Reflection

Answer here:

Section 1, part b

Just like ggplot, Plotly allows us to create visualizations, but with slightly different formatting. Below are several examples of common graph types—this time using Plotly.
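
One syntax difference is worth noting before you run the examples: in plot_ly(), variables are mapped with a formula, written with a tilde (~), which tells plotly to look the name up in the data argument. The sketch below, with object names of our own choosing, builds the same scatter plot two ways to show what the tilde is doing.

```r
library(plotly)

# With a formula: ~wt means "the wt column of the data argument"
p_formula <- plot_ly(
  data = mtcars, x = ~wt, y = ~mpg,
  type = "scatter", mode = "markers"
)

# Without a formula: pass the vectors themselves
p_vectors <- plot_ly(
  x = mtcars$wt, y = mtcars$mpg,
  type = "scatter", mode = "markers"
)

p_formula
```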

As you run each cell, interact with the plots. Try hovering over points, zooming in on a region, and, for the 3D plot, rotating the view.

Scatter plot

plot_ly(
  data = mtcars, x = ~wt, y = ~mpg, type = "scatter", mode = "markers") %>%
  layout(
    title = "Scatter Plot: Weight vs. MPG",
    xaxis = list(title = "Weight (thousand lbs)"),
    yaxis = list(title = "Miles per Gallon"))

Histogram

plot_ly(
  data = mtcars, x = ~factor(cyl), type = "histogram") %>%
  layout(
    title = "Histogram of Cylinder Count",
    xaxis = list(title = "Number of Cylinders"),
    yaxis = list(title = "Count of Cars"))

Line chart

mtcars_sorted <- mtcars %>% arrange(hp)

plot_ly(
  data = mtcars_sorted, x = ~hp, y = ~mpg, type = "scatter", mode = "lines") %>%
  layout(
    title = "Line Plot: MPG Across Increasing Horsepower",
    xaxis = list(title = "Horsepower"),
    yaxis = list(title = "Miles per Gallon"))

3D plot

  • This is a very cool one!
plot_ly(
  data = mtcars, x = ~wt, y = ~mpg, z = ~hp,
  color = ~factor(cyl), type = "scatter3d", mode = "markers") %>%
  layout(
    title = "3D Scatter Plot: Weight, MPG, and Horsepower",
    scene = list(
      xaxis = list(title = "Weight (thousand lbs)"),
      yaxis = list(title = "Miles per Gallon"),
      zaxis = list(title = "Horsepower")),
    legend = list(title = list(text = "Cylinders")))

Reflection

  • How does interactivity (hover, zoom, filtering) change the way you understand the information in these graphs compared to static ggplot visuals?

  • In what types of data scenarios do you think Plotly would add meaningful value, and when might a static ggplot be more appropriate?

Answer here:

Section 2: Making the graph interactive

Returning to the original ggplot scatterplot, there are two ways to convert it into an interactive Plotly visualization.

Approach 1. Build the plot with ggplot, then let plotly convert it with ggplotly(). As you can see, this is the same ggplot code we used earlier, and it produces an interactive version of the plot. If you hover over a point, you can see its x and y values.

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(
    x = "Weight (X thousand lbs)",
    y = "Miles per gallon",
    title = "Fuel efficiency by weight",
    color = "Cylinders"
  ) +
  theme_minimal()

# Convert the ggplot object into an interactive plotly object
ggplotly(p)
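
ggplotly() also accepts a tooltip argument that controls which aesthetics appear when you hover. The following is a small sketch rather than part of the original activity. It rebuilds a simplified version of the plot, under the name p_simple, so that it can run on its own.

```r
library(ggplot2)
library(plotly)

p_simple <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point()

# Show only the x and y values in the hover box
p_interactive <- ggplotly(p_simple, tooltip = c("x", "y"))
p_interactive
```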

Approach 2. This approach uses Plotly syntax from the start, giving more control over features like hover labels, legends, colors, and marker styling.

plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  color = ~factor(cyl),
  colors = "Set1",
  type = "scatter",
  mode = "markers",
  marker = list(size = 10)
) %>%
  layout(
    title = "Fuel efficiency by weight",
    xaxis = list(title = "Weight (X thousand lbs)"),
    yaxis = list(title = "Miles per gallon"),
    legend = list(title = list(text = "Cylinders"))
  )

This activity only introduces the basics of Plotly, but the package offers many additional tools and visualization types. We encourage you to explore further and try out different interactive features and plot options.

Key Takeaways:

  • Plotly uses concepts similar to ggplot but with different syntax

  • Basic plots can be created with a few lines of code (it’s pretty easy to make them interactive!)

  • Interactivity sets it apart from ggplot (hovering, zooming, and rotating help make patterns clearer)

  • Plotly is ideal when you want users to explore and learn a bit on their own

Section 3: Adding Hover Labels, Color, and Customization

So far, we’ve used Plotly to make basic interactive plots where you can zoom and hover. But the default hover labels and colors aren’t always the most helpful.

Plotly lets you:

  • Customize hover labels: decide exactly which variables appear when you hover over a point, and how they’re formatted (e.g., “Weight: 2.5, MPG: 22”).

  • Control color: map color to a variable (like number of cylinders) and choose a color palette that makes groups easy to compare.

  • Style markers: change size, opacity, and sometimes symbol, which helps highlight important points or avoid overplotting.

This combination makes your graph feel more like a little “data app” — each hover tells a small story about that specific car.


Here’s a customized scatterplot using mtcars. We’ll:

  • Map color to cyl (number of cylinders)
  • Add extra info to the hover text (horsepower + gears)
  • Make markers bigger and slightly transparent
plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl),
  colors = "Dark2",
  marker = list(size = 12, opacity = 0.8),
  text = ~paste(
    "Model:", rownames(mtcars),
    "<br>Weight:", wt,
    "<br>MPG:", mpg,
    "<br>Horsepower:", hp,
    "<br>Gears:", gear
  ),
  hoverinfo = "text"
) %>%
  layout(
    title = "Fuel Efficiency with Custom Hover Labels",
    xaxis = list(title = "Weight (thousand lbs)"),
    yaxis = list(title = "Miles per Gallon"),
    legend = list(title = list(text = "Cylinders"))
  )

When you hover over points now, you don’t just see numbers — you get a mini “profile” of each car.
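
Another option for hover labels is the hovertemplate attribute, which lets you format numbers directly. This sketch assumes plotly's template syntax, where %{x} and %{y} refer to the point's coordinates, a d3-style format such as :.1f rounds to one decimal place, and <extra></extra> hides the secondary box that normally shows the trace name.

```r
library(plotly)

p_template <- plot_ly(
  data = mtcars, x = ~wt, y = ~mpg,
  type = "scatter", mode = "markers",
  # Round weight to two decimals and MPG to one decimal in the hover box
  hovertemplate = "Weight: %{x:.2f}<br>MPG: %{y:.1f}<extra></extra>"
)
p_template
```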


Exercise

Goal: Practice customizing hover labels and styling so the plot tells a clearer story.

  1. Start from the example above (you can copy/paste it).

  2. Make the following three changes:

    • Add at least one different variable to the hover text (e.g., qsec or carb).
    • Change either the color palette or which variable is used for color.
    • Change at least one marker property: size, opacity, or symbol.

Use this chunk as your starting point:

# Exercise: Customize hover labels and styling

plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  # TODO: choose your color mapping and palette
  color = ~factor(cyl),
  colors = "Set1",
  marker = list(
    size = 10,      # you can change this
    opacity = 0.9   # and this
  ),
  text = ~paste(
    # TODO: customize your hover text
    "Model:", rownames(mtcars),
    "<br>Weight:", wt,
    "<br>MPG:", mpg
  ),
  hoverinfo = "text"
) %>%
  layout(
    title = "Your Customized Interactive Plot",
    xaxis = list(title = "Weight (thousand lbs)"),
    yaxis = list(title = "Miles per Gallon")
  )

Short reflection

  • What did you change?
  • Which hover fields felt most useful to you, and why?

Answer here:


Section 5: Filter Function & Zooming

Interactivity isn’t just about pretty hover labels — it’s also about controlling which data you see and how closely you look at it.

There are two main ideas here:

  1. Filtering the data before plotting. Using dplyr::filter(), you can focus on a subset of the data (e.g., only cars with high MPG, or only 4-cylinder cars). This makes your interactive plot more targeted.

  2. Zooming and panning inside Plotly. Once the plot is rendered, you can:

    • Click and drag to zoom into a region.
    • Use toolbar buttons to pan around, zoom in/out, or reset.
    • This helps you explore patterns that are hard to see when everything is crammed together.

Combining filtering + zooming lets you move between “big picture” and “close-up” views of your dataset.
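
As a quick illustration of the filtering idea, filter() accepts several conditions separated by commas, and a row is kept only if all of them are true. The object name below is our own.

```r
library(dplyr)

# Keep only 4-cylinder cars with a manual transmission (am == 1)
small_manual <- mtcars %>%
  filter(cyl == 4, am == 1)

nrow(small_manual)
```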


Example A: Filter to only 4-cylinder cars

mtcars_4cyl <- mtcars %>%
  filter(cyl == 4)

plot_ly(
  data = mtcars_4cyl,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  marker = list(size = 10),
  color = ~factor(gear)
) %>%
  layout(
    title = "Filtered Plot: Only 4-Cylinder Cars",
    xaxis = list(title = "Weight"),
    yaxis = list(title = "Miles per Gallon"),
    legend = list(title = list(text = "Gears"))
  )
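
One thing to watch for when you combine filtering with custom hover text: rownames(mtcars) always returns all 32 model names, so it will not line up with a filtered data frame. A sketch of one way around this, assuming you are happy to add a column (here called model, a name we made up), is to store the names before filtering.

```r
library(dplyr)
library(plotly)

# Store the model names in a regular column so they survive filtering
cars_named <- mtcars %>%
  mutate(model = rownames(mtcars))

cars_4cyl <- cars_named %>%
  filter(cyl == 4)

p_4cyl <- plot_ly(
  data = cars_4cyl, x = ~wt, y = ~mpg,
  type = "scatter", mode = "markers",
  text = ~paste("Model:", model, "<br>MPG:", mpg),
  hoverinfo = "text"
)
p_4cyl
```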

Example B: Zooming on an unfiltered plot

plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl)
) %>%
  layout(
    title = "Try Zooming and Panning (Drag to zoom, double-click to reset)",
    xaxis = list(title = "Weight"),
    yaxis = list(title = "Miles per Gallon")
  )

Try: click-and-drag to zoom into a cluster of points, then double-click to reset.
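
You can also set the starting zoom level in code by giving the axes an explicit range in layout(). The range values below are arbitrary numbers chosen for illustration, and you can still zoom and pan from there.

```r
library(plotly)

p_zoomed <- plot_ly(
  data = mtcars, x = ~wt, y = ~mpg,
  type = "scatter", mode = "markers",
  color = ~factor(cyl)
) %>%
  layout(
    # Start zoomed in on lighter, more fuel-efficient cars
    xaxis = list(title = "Weight", range = c(1.5, 3.5)),
    yaxis = list(title = "Miles per Gallon", range = c(20, 35))
  )
p_zoomed
```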


Exercise

Goal: Use both filtering and zooming to explore a subset of cars more deeply.

  1. Use dplyr::filter() to pick one subset of interest, such as:

    • Cars with mpg > 25
    • Cars with hp > 150
    • Only 6-cylinder cars (cyl == 6)
    • Only manual transmission cars (am == 1)
  2. Make an interactive scatterplot of wt vs mpg for that subset.

  3. Interact with it:

    • Zoom into a region.
    • Hover over several points.
  4. Write 1–2 sentences about something you saw that you might not have noticed in the full dataset.

Scaffolded code:

# Exercise: Filter + zoom exploration

# 1. Filter the dataset (change this line to your own filter condition)
mtcars_subset <- mtcars %>%
  filter(mpg > 25)   # <-- edit this condition

# 2. Make an interactive scatterplot
plot_ly(
  data = mtcars_subset,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl),
  marker = list(size = 10)
) %>%
  layout(
    title = "Filtered & Zoomable Plot",
    xaxis = list(title = "Weight"),
    yaxis = list(title = "Miles per Gallon")
  )
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
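
The warning above is not an error. It appears to come from the default color palette, which expects at least three groups, while this filtered subset contains only 4-cylinder cars. One approach that we would expect to avoid it, offered as an assumption rather than an official fix, is to supply your own colors. The hex codes below are arbitrary choices.

```r
library(dplyr)
library(plotly)

mtcars_subset <- mtcars %>%
  filter(mpg > 25)

# Every car in this subset has 4 cylinders, so there is only one color group
unique(mtcars_subset$cyl)

# Explicit colors, so plotly does not need to look up a default palette
my_colors <- c("#1b9e77", "#d95f02", "#7570b3")

p_subset <- plot_ly(
  data = mtcars_subset, x = ~wt, y = ~mpg,
  type = "scatter", mode = "markers",
  color = ~factor(cyl), colors = my_colors,
  marker = list(size = 10)
)
p_subset
```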

Reflection:

  • What filter did you apply (in words)?
  • After zooming and hovering, what did you notice about this subset?

Answer here:


Section 6: Design Your Own Plotly Visualization

Goal: Now it’s your turn to design your own interactive visualization!

Task:

  1. Pick a dataset and a question that interest you.

Option A: keep using mtcars

#your_data <- mtcars

Option B: replace with your own dataset, e.g.:

#your_data <- readr::read_csv("my_data.csv")
  2. Choose which variables to map to x, y, and color
# TODO: change these to variables that make sense for your question
x_var <- ~wt 
y_var <- ~mpg 
color_var <- ~factor(cyl)

plot_ly(
  data = your_data,
  x = x_var,
  y = y_var,
  type = "___",
  mode = "___",
  color = color_var,
  marker = list()
) %>%
  layout(
    title = "_______",
    xaxis = list(title = "______"),
    yaxis = list(title = "______")
  )
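
If you are not sure how to start, here is one possible way to fill in the template. It is only an illustration: the question (does horsepower relate to quarter-mile time, and does transmission type matter?), the variables, and the colors are our own choices.

```r
library(plotly)

your_data <- mtcars

# Variables chosen for this illustration only
x_var <- ~hp
y_var <- ~qsec
color_var <- ~factor(am)

p_own <- plot_ly(
  data = your_data,
  x = x_var,
  y = y_var,
  type = "scatter",
  mode = "markers",
  color = color_var,
  colors = c("#1b9e77", "#d95f02"),
  marker = list(size = 10)
) %>%
  layout(
    title = "Quarter-mile time by horsepower",
    xaxis = list(title = "Horsepower"),
    yaxis = list(title = "Quarter-mile time (seconds)"),
    legend = list(title = list(text = "Transmission (0 = automatic, 1 = manual)"))
  )
p_own
```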

Reflection:

  • What question were you trying to explore?
  • What dataset and variables did you choose, and why?
  • Did the interactive features (hover, color, zoom, etc.) help you see anything you might have missed in a static plot?

Answer here:

Section 7: Putting It Together: A Mini Dashboard (Subplots)

Plotly also makes it very easy to combine multiple interactive charts into a single dashboard-style view. This is useful when you want to compare several patterns at once, show different summaries of the same dataset, or tell a bigger story with your data.

We’ll use the function subplot() to place several plot_ly() objects in one layout.

Let’s build three separate plots, then combine them.

Example: Three plots of mtcars in one dashboard

# 1. Scatter: weight vs. MPG, colored by cylinders
p_scatter <- plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl),
  colors = "Set1",
  marker = list(size = 10, opacity = 0.8),
  text = ~paste(
    "Model:", rownames(mtcars),
    "<br>Weight:", wt,
    "<br>MPG:", mpg,
    "<br>Cylinders:", cyl
  ),
  hoverinfo = "text"
) %>%
  layout(
    title = "Weight vs MPG",
    xaxis = list(title = "Weight"),
    yaxis = list(title = "Miles per Gallon")
  )

# 2. Boxplot: MPG by cylinder count
p_box <- plot_ly(
  data = mtcars,
  x = ~factor(cyl),
  y = ~mpg,
  type = "box",
  color = ~factor(cyl),
  colors = "Set1"
) %>%
  layout(
    title = "MPG Distribution by Cylinders",
    xaxis = list(title = "Cylinders"),
    yaxis = list(title = "MPG")
  )

# 3. Bar chart: count of cars by # of gears
gear_counts <- mtcars %>%
  count(gear)

p_bar <- plot_ly(
  data = gear_counts,
  x = ~factor(gear),
  y = ~n,
  type = "bar"
) %>%
  layout(
    title = "Number of Cars by Gears",
    xaxis = list(title = "Gears"),
    yaxis = list(title = "Count of cars")
  )
# Combine them into a 2x2 dashboard (one empty cell)

dashboard <- subplot(
  p_scatter, p_box,       # first row
  p_bar, NULL,      # second row (empty space on right)
  nrows = 2,              # Arrange plots in 2 rows
  
  titleX = TRUE,
  titleY = TRUE,
  
  margin = 0.05        # space between plots
)

dashboard
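
If you prefer to make an empty cell explicit, plotly provides plotly_empty(), which creates a blank placeholder plot. The sketch below uses two simple plots of our own so it can run by itself.

```r
library(plotly)

p_left <- plot_ly(mtcars, x = ~wt, y = ~mpg, type = "scatter", mode = "markers")
p_bottom <- plot_ly(mtcars, x = ~factor(gear), type = "histogram")

# A blank placeholder for cells that should stay empty
blank <- plotly_empty(type = "scatter", mode = "markers")

# Plots fill the grid row by row, so the right-hand column stays empty
grid <- subplot(
  p_left, blank,
  p_bottom, blank,
  nrows = 2, margin = 0.05
)
grid
```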

Try: change the parameters and make it the way you like!

We can further improve readability by adjusting the overall layout:

dashboard %>%
  layout(
    title = list(
      text = "Dashboard: Exploring mtcars",
      font = list(size = 22)
    ),
    paper_bgcolor = "white",               # background for whole dashboard
    plot_bgcolor = "rgba(245,245,245,0.6)" # background of each subplot
  )

Reflection

  • What story or pattern does the dashboard help you see?
  • Did placing the plots together reveal patterns you didn’t see before?

Answer here:

Section 8: Linking Plots

Plotly also supports linked brushing, which means:

  • When you select points in one plot, the same points are highlighted in other plots.

Example

  1. First create three separate plots (p1, p2, p3), then use subplot() to place them side by side in a single dashboard.
# Scatter plot: relationship between weight and mpg
p1 <- plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl),
  text = ~paste("Model:", rownames(mtcars)),
  hoverinfo = "text"
) %>%
  layout(title = "Weight vs MPG (colored by cylinders)")

# Boxplot: MPG distribution by cylinders
p2 <- plot_ly(
  data = mtcars,
  x = ~factor(cyl),
  y = ~mpg,
  type = "box",
  color = ~factor(cyl)
) %>%
  layout(title = "MPG Distribution by Cylinders")

# Histogram: distribution of car weights
p3 <- plot_ly(
  data = mtcars,
  x = ~wt,
  type = "histogram"
) %>%
  layout(title = "Distribution of Car Weights")

subplot(p1, p2, p3, nrows = 1, margin = 0.05)
  2. Make them linked (coordinated) plots using the crosstalk package.

crosstalk is a package that allows different Plotly plots to communicate with each other. It enables:

  • linked brushing (select points in one plot → highlights in others)
  • shared filtering

The plots below show the same dataset as before, but now they are linked. (If you select a group of points in one plot, the same observations will be highlighted in the others.)
library(crosstalk)
# Add a shared key to link the plots
shared <- SharedData$new(mtcars)

# Linked scatter plot
pA <- plot_ly(
  shared,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl)
) %>%
  layout(title = "Scatter: Weight vs MPG")

# Linked boxplot
pB <- plot_ly(
  shared,
  x = ~factor(cyl),
  y = ~mpg,
  type = "box",
  color = ~factor(cyl)
) %>%
  layout(title = "Boxplot: MPG by Cylinders")

# Linked histogram
pC <- plot_ly(
  shared,
  x = ~wt,
  type = "histogram"
) %>%
  layout(title = "Histogram: Weight Distribution")

subplot(pA, pB, pC, nrows = 1)
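
By default, linked plots respond to clicks. To get brushing with a box or lasso selection, plotly's highlight() function can change the triggering event. This is a sketch under that assumption, using two scatter plots and object names of our own. The slider at the end uses crosstalk's filter_slider() and bscols() to illustrate shared filtering.

```r
library(plotly)
library(crosstalk)

shared_cars <- SharedData$new(mtcars)

p_wt <- plot_ly(shared_cars, x = ~wt, y = ~mpg, type = "scatter", mode = "markers")
p_hp <- plot_ly(shared_cars, x = ~hp, y = ~mpg, type = "scatter", mode = "markers")

# Select with a box instead of a click; a deselect event clears the highlight
linked <- subplot(p_wt, p_hp, nrows = 1, titleX = TRUE, titleY = TRUE) %>%
  layout(dragmode = "select", showlegend = FALSE) %>%
  highlight(on = "plotly_selected", off = "plotly_deselect")

# A slider that filters both plots at once, placed beside the linked plots
linked_with_slider <- bscols(
  widths = c(3, 9),
  filter_slider("mpg_slider", "MPG", shared_cars, ~mpg),
  linked
)
linked_with_slider
```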

Reflection

  • What additional advantages do you get when the plots are linked?

Answer here:

Section 9: Optional Challenge

Use the new skills you learned from this lesson to build your own mini interactive dashboard using a dataset you think is interesting and answer a question that is meaningful.

AI usage

  • AI assistance was used to improve grammar, clarity, and spelling.

  • AI assistance was used for formatting.